perm filename YYY[P,JRA] blob
sn#525181 filedate 1980-07-27 generic text, type C, neo UTF8
COMMENT ⊗ VALID 00003 PAGES
C REC PAGE DESCRIPTION
C00001 00001
C00002 00002 .device xgp
C00028 00003 .begin center
C00029 ENDMK
C⊗;
.device xgp
.TURN ON "%","{";
.FONT 1 "baxl30"; <<normal font>>
.FONT 2 "BAXB30"; <<headings>>
.FONT 3 "BAXI30" <<mexpr font:basi30+>>
.<<FONT 3 "BAXI30[p,jra]" <<mexpr font:basi30+>>
.FONT 4 "SUB"; <<subscripts>>
.FONT 5 "set1" <<for meta vars in denotational crap>>
.FONT 6 "NGR20"; << font for sups & subs>>
.FONT 7 "grk30"
.FONT 8 "SUP"; <<superscripts>>
.<< prime under prime, pi under p>>
.FONT a "baxb30"; <<chapter titles and numbers>>
.<<FONT a "SET1"; <<fancy horse shit>>
.<<FONT c "FIX30"; <<for ∧ and ≡>>
.begin center
%2The Post Correspondence Problem as a Question of Ambiguity%1
.end
.group skip 6;
.turn on "#"
In courses dealing with computability the Post Correspondence Problem (PCP) is
usually
introduced by definition, shown to be unsolvable by reducing it to the halting
problem (Minsky, Manna), and disposed of as yet
another example of an unsolvable class of questions. Rarely do students get
any real understanding of what the PCP is and where difficulties lie in
finding an answer to a specific example.
A partial decision procedure can be very helpful in developing this understanding.
The Sardinas-Patterson test for ambiguity of a code provides an algorithm
that can be modified to search for answers to any specific example of the PCP.
First, we present the Sardinas-Patterson test. Secondly, we show how the test
can be modified to solve a problem similar but not equivalent to the PCP.
We then state the Post Correspondence Problem itself and develop a partial decision procedure by a
further modification to the algorithm.⊗↓It is suggested that anyone already
familiar with the Sardinas-Patterson test skip to section 2.←
.group skip 2
.begin center
%aSection 1: Sardinas Patterson Test%1
.end
.turn off "{", "}"
A %3code%1 is a set of code words, or strings, from some given finite alphabet,
e.g., for the alphabet {0, 1}, a sample code is {001, 01, 0100, 101}.
A code is ambiguous if there exists a string which can be interpreted, or parsed,
in two different ways. For example, with the given code, the string "0100101"
can be parsed as a string of three words, "01.001.01", or as a string of
two words, "0100.101".
The Sardinas-Patterson test provides an algorithm for determining whether or not
a given code is ambiguous, and, if it is, the test also provides the means of
constructing an example ambiguous string.
If %7a%1=%7bg%1, where %7a%1, %7b%1, and %7g%1 are strings, then %7b%1 is said to be a "prefix"
of %7a%1 and %7g%1 is called the "tail". If there exists an ambiguous string, then
it must begin with a code word which is the prefix of another code word. The tail
must then form a prefix of some code word. With
each tail we keep track of, we are saying that there is a string that is
ambiguous up to a point, but we have this overhang, this tail, which must still be
incorporated as a legal code sequence.
.turn on "{", "}"
%2Sardinas-Patterson Test%1⊗↓The algorithm given was presented by D. Huffman in
an Information Sciences course at the University of California at Santa Cruz.←:
.begin nofill
We make a table, entering the code words in column 0. We add entries in the rest
of the columns according to the following rules:
#####
1) to construct column 1, check column 0 to see if any of the code words
is a prefix of any other code word.
a) if there is no such pair of code words, then the code is not
ambiguous.
b) for each such pair that can be found, enter the "tail" in
column 1, and proceed.
####
2) to construct column n+1 (n≥1):
a) examine column n for any word13hich is a prefix of some word
in column 0. For each such pair, enter the tail in column n+1.
b) Examine column 0 for any word13hich is a prefix of some word
in column n. For each such pair, enter the tail in column n+1.
#####
3) Continue the process of creating new columns by computing tails until:
a) No entries are found for a new column - in this case the code
is not ambiguous.
b) A tail created in some column is one of the original code
words. In this case the code is ambiguous and an example can be
constructed from the entries in the table.
c) A column is repeated - at this point we know we shall be
forever repeating the same pattern, as the construction of new
columns depends only on the single previous column and the
original code in column 0. In this case the code is again
unambiguous.
#####
.end
For example, for the code given above, the algorithm yields the table of Figure 1.
The ambiguous string "0100101" can be constructed by tracing the history of the
marked entry. It yields the two parses "01.001.01." and "0100.101.".
.begin nofill
0 1 2 3
______________________________
001 00 1 01 *** this tail is also a code word!***
01
0100
101
Figure 1.
.end
The number of columns created in this process gives an indication of the delay
required, in the worst case, to resolve any temporary ambiguity which may occur
even when the code itself is unambiguous. If the algorithm stops in condition 3)c),
then there is no bound on this delay; we may have to wait until the entire message is
received before we can start decoding it.
.group skip 2
.begin center
%aSection 2: Modified PCP%1
.end
.turn on "↓"
.turn off "{", "}"
Suppose we are given a finite set of ordered pairs of strings
{(%7a%41%1,%7b%41%1), ...,(%7a%4n%1,%7b%4n%1)} and
asked to determine whether or not indices i%41%1, ..., i%4k%1 and j%41%1, ..., j%4r%1
exist such that %7a%4i↓1%1...%7a%4i↓k%1 = %7b%4j↓1%1...%7b%4j↓r%1.
That is, we must be
able to decide whether there exists a string that can be interpreted as consisting
either entirely of %7a%1's or entirely of %7b%1's, and produce the
string if it does exist.
This time we make a table with pairs of columns since we need to keep the
%7a%1's separated from the %7b%1's. To get started we must find an %7a%4i%1 and %7b%4j%1
such that one is the prefix of the other. The tail must then be interpretable as
the beginning of either a string of %7a%1's (if %7a%4i%1 was the prefix of %7b%4j%1) or
a string of %7b%1's (if %7b%4j%1 was the prefix of %7a%4i%1).
.turn on "#"
Suppose that we have already listed n columns in the table. Each entry %7g%1 in
column n%4α%1 indicates that a string %7s%1 can be constructed which may be
interpreted as consisting entirely of %7b%1's, or entirely of %7a%1's with the
string %7g%1 on the end. That is, the string %7s%1#=#a%41%1...a%4k%1a%4k+1%1...a%4n%1
(where each a%4i%1 is a letter of the code alphabet) is the same as
%7b%4j↓1%1...%7b%4j↓m%1 for some %4j↓1%1...%4j↓m%1, and the same as
%7a%4i↓1%1...%7a%4i↓p%7g%1 for some %4i↓1%1...%4i↓p%1.
If %7g%1 itself is an %7a%1, then we are done; but if not,
we must continue our search. If %7g%1 is a prefix of an %7a%4i%1, then we extend
our string of %7a%1's by that %7a%4i%1, resulting in the string we had before
with the addition of the new tail on the end. This string can be interpreted
as consisting entirely of %7a%1's, or entirely of %7b%1's with the new tail appended
to the end. Thus, we must add the new tail to the column (n+1)%4β%1.
If some %7a%4i%1 is a prefix of %7g%1, then we have extended our %7a%1-interpretation
of %7s%1, and the new tail is the "leftover" in the same sense that %7g%1 was.
Thus we add the new tail to column (n+1)%4α%1.
We do an analogous examination of column n%4β%1. If we ever find an %7a%4i%1 in
some column n%4α%1, or a %7b%4j%1 in some column n%4β%1, then we are through.
There is a solution string and it can be constructed by tracing the history of the
entry %7a%4i%1 in n%4α%1, or %7b%4j%1 in n%4β%1, as the case may be.
This discussion is summarized as the following algorithm:
.begin nofill
####
1) Enter the %7a%1's and %7b%1's in columns 0%4α%1 and 0%4β%1, respectively.
#####
2) Create columns 1%4α%1 and 1%4β%1 :
a) If any %7a%4i%1 is a prefix of any %7b%4j%1, enter the tail in 1%4α%1.
b) If any %7b%4i%1 is a prefix of any %7a%4j%1, enter the tail in 1%4β%1.
####
3) To create columns (n+1)%4α%1 and (n+1)%4β%1 :
a1) If any string in n%4α%1 is a prefix of any %7a%4i%1,
enter the tail in (n+1)%4β%1.
a2) If any %7a%4i%1 is a prefix of any string in n%4α%1,
enter the tail in (n+1)%4α%1.
.next page
b1) If any string in n%4β%1 is a prefix of any %7b%4i%1,
enter the tail in (n+1)%4α%1.
b2) If any %7b%4i%1 is a prefix of any string in n%4β%1,
enter the tail in (n+1)%4β%1.
####
4) Continue creating new columns according to step 3) until one of the
following conditions occurs:
a) The process closes out, that is, the entries in n%4α%1 and n%4β%1
are such that the columns (n+1)%4α%1 and (n+1)%4β%1 are empty. In
this case we can say that no solution is possible.
b) Some column n%4α%1 contains an %7a%4i%1, or some column n%4β%1 contains
a %7b%4i%1; then a solution exists and can be found by tracing the
history of such an entry through the table.
c) A pattern of columns develops which will continue infinitely,
that is, an %7a%1-%7b%1 pair of columns is repeated; then there is no
solution. New columns depend only on the single previous pair of
columns and the original columns 0%4α%1 and 0%4β%1. Therefore
once any n%4α%1-n%4β%1 pair of columns is repeated, the same
columns will follow the pair ad infinitum.
.end
One of the conditions 4)a) - 4)c) is guaranteed to occur.
Since "tails" have length less than some code, and there
is a maximum length among the %7a%4i%1 and %7b%4i%1, there is a maximum tail length,
and thus a finite number of possible tails. Therefore, only a finite
number of n%4α%1-n%4β%1 column pairs is possible.
.begin center
%aSection 3: Post Correspondence Problem%1
.end
The difference between the problem stated above and the Post Correspondence
Problem is that we were not restricted to using the same indexing sequence for both
%7a%1's and %7b%1's. The following statement of the PCP is taken from Minsky.
.begin indent 10,10,10
%2Post Correspondence Problem:%1
.begin single space
Given an alphabet A and a finite set of pairs of words (%7a%4i%1, %7b%4i%1) in the
alphabet A, is there a sequence i%41%1, i%42%1, ..., i%4n%1 of selections such
that the strings
.begin center; turn on "↓"
%7a%4i↓1%7a%4i↓2%1...%7a%4i↓n%1 and %7b%4i↓1%7b%4i↓2%1...%7b%4i↓n%1
.end
formed by concatenating--writing down in order--corresponding %7a%1's and %7b%1's
are identical?
.end
.end
.begin turnoff "{","}"
We can modify the table building procedure further to search for a
solution to the PCP, providing a partial decision procedure. If a solution
exists, then we will find it, but if there is no solution we may end up looking
forever as we lose the guarantee of termination provided by the algorithm above.
As an example, we construct a solution for the set of pairs {(1,111), (10111,10),
(10,0)} in Figure 2.
.end
We again build a table with column pairs.
This time, to get started we must find an %7a%4i%1 and %7b%4i%1 (with the %3same%1 index)
such that one is a prefix of the other. As we continue listing columns, we must
be careful to ensure that we are constructing a string interpretable as a sequence
of %7a%4i%1's and also as the %3same sequence%1 of %7b%4i%1's. Every time we make use
of an %7a%4i%1 we must also be able to use the corresponding %7b%4i%1.
.begin nofill
1) Enter the %7a%1's and %7b%1's in columns 0%4α%1 and 0%4β%1, respectively, being
careful to enter them in order.
####
2) Create columns 1%4α%1 and 1%4β%1:
a) If any %7a%4i%1 is a prefix of %7b%4i%1 (note the same index is required),
then enter the tail in 1%4α%1.
b) If any %7b%4i%1 is a prefix of the corresponding %7a%4i%1, then enter the
tail in 1%4β%1.
#####
3) Create columns (n+1)%4α%1 and (n+1)%4β%1:
a1) If any string in n%4α%1 is an %7a%4i%1, write the corresponding %7b%4i%1 in
column (n+1)%4α%1.
a2) If any string in n%4α%1 is a prefix of any %7a%4i%1, check the tail:
i) If the tail is the corresponding %7b%4i%1, then we have
a solution!
ii) If the tail is a prefix of the corresponding %7b%4i%1,
enter the %3new%1 tail (i.e., what is left of %7b%4i%1 after
deleting the tail from the front of it) in (n+1)%4α%1.
iii) If the corresponding %7b%4i%1 is a prefix of the tail,
enter the new tail in (n+1)%4β%1.
a3) If any %7a%4i%1 is a prefix of any string in n%4α%1, add the corre-
sponding %7b%4i%1 to the end of the tail, and enter the whole thing
in column (n+1)%4α%1. (e.g. In Figure 2, the entry "1111" in column
2%4α%1, results from "1", in 0%4α%1, being a prefix of "11", in 1%4α%1,
thus "111", %7b%41%1, was added to the tail "1". Again, in 3%4α%1,
"1", in 0%4α%1, is a prefix of "1111", in 2%4α%1, so "111", %7b%41%1,
is added to the tail, producing the entry "111111".)
####
b1) If any string in n%4β%1 is an %7b%4i%1, write the corresponding %7a%4i%1 in
column (n+1)%4β%1. (e.g. In column 1%4β%1, "111" appears, thus "1",
the corresponding %7a%4i%1, appears in 2%4β%1.)
b2) If any string in n%4β%1 is a prefix of any %7b%4i%1, check the tail:
i) If the tail is the corresponding %7a%4i%1, then we have
a solution!
ii) If the tail is a prefix of the corresponding %7a%4i%1,
enter the %3new%1 tail in (n+1)%4β%1.
iii) If the corresponding %7a%4i%1 is a prefix of the tail,
enter the new tail in (n+1)%4α%1. (e.g. In Figure 2,
column 2%4β%1 contains "1", which is a prefix of %7b%41%1;
the tail is "11"; the corresponding %7a%41%1 is "1",
which is a prefix of the tail "11", thus the %3new%1
tail is "1" and is entered in 3%4α%1.)
b3) If any %7b%4i%1 is a prefix of any string in n%4β%1, add the corre-
sponding %7a%4i%1 to the end of the tail, and enter the whole thing
in column (n+1)%4β%1.
####
.end
To facilitate the construction of our solution we subscript each entry in the table
with the index of the pair used in deriving it. We also use arrows to trace the
history of entries. Then, when we've found that a solution exists, we can simply
follow the path from the first column pair, noting the subscripts as we go, and we
are left with the desired list of indices. If a solution exists, it must be constructable
by the given rules.
.begin nofill
####
0%4α%1 0%4β%1 1%4α%1 1%4β%1 2%4α%1 2%4β%1 3%4α%1 3%4β%1 4%4α%1 4%4β%1
___________________________________________________________________
####
1 111 11%4(1)%1 1111%4(1)%1 111111%4(1)%1
10111 10 111%4(2)%1 1%4(1)%1 1%4(1)%1 111%4(1)%1
10 0 *%4(3)%1
####
Figure 2.
####
.end
When a solution does not exist, we may or may not be able to determine so. The
table might "close out" after a finite number of steps, that is, there may be no
more possible entries. In this case there is no solution. The column generating
process might also continue ad infinitum; this is detectable in special cases but
not in general. For example, if any column-pair is ever repeated, there is no
solution.
The difficulties arise in step 3) parts a3) and b3). The entries afforded by these
rules, and thus the tails used elsewhere, can grow arbitrarily long, thus ruining
our chances for a guarantee of termination of the process.
Of course, we have not shown that the problem is unsolvable. The motive for
introducing a partial decision procedure is to offer students some insight into
the problem itself. The proof of unsolvability may then be presented as usual,
by reduction to the Halting Problem or some other class of problems known to
the students.
.begin center
%aReferences%1
.end
.group skip 6
.turn on "#"
.nofill
Huffman, D., IS10:Introduction to Cybernetics, Information Sciences Board,
University of California at Santa Cruz, Spring Quarter, 1978.
#####
Manna, Z., %3Mathematical Theory of Computation%1, McGraw-Hill, New York,
New York, 1974.
###########################################################################
Minsky, M., %3Computation: Finite and Infinite Machines%1, Prentice-Hall, Inc.,
Englewood Cliffs, New Jersey, 1967.